A new system identification approach to identify genetic variants in sequencing studies for a binary phenotype.

Authors

  • Guolian Kang
  • Wenjian Bi
  • Yanlong Zhao
  • Ji-Feng Zhang
  • Jun J Yang
  • Heng Xu
  • Mignon L Loh
  • Stephen P Hunger
  • Mary V Relling
  • Stanley Pounds
  • Cheng Cheng
Abstract

We propose a set-valued (SV) system model, a generalization of logistic (LG) and probit (Probit) regression, as a method for discovering genetic variants associated with a binary phenotype, especially rare variants in next-generation sequencing studies. We introduce a new SV system identification method that estimates all underlying key system parameters of the Probit model, and we compare it with the LG model in the setting of genetic association studies. Across an extensive series of simulation studies, the Probit method maintained type I error control and had similar or greater power than the LG method, and these results were robust to the noise distribution (logistic, normal, or t). Additionally, the Probit association parameter estimate was 2.7- to 46.8-fold less variable than the LG log-odds ratio association parameter estimate. Less variability in the association parameter estimate translates to greater power and robustness across the spectrum of minor allele frequencies (MAFs), and these advantages are most pronounced for rare variants. For instance, in a simulation that generated data from an additive logistic model with an odds ratio of 7.4 for a rare single nucleotide polymorphism with a MAF of 0.005 and a sample size of 2,300, the Probit method had 60% power whereas the LG method had 25% power at the α = 10⁻⁶ level. Consistent with these simulation results, the set of variants identified by the LG method was a subset of those identified by the Probit method in two example analyses. Thus, we suggest that the Probit method may be a competitive alternative to the LG method in genetic association studies, such as candidate gene, genome-wide, or next-generation sequencing studies, for a binary phenotype.
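
The simulation setting quoted in the abstract can be sketched in a few lines of Python. This is only an illustration of the data-generating model (a binary phenotype obtained by thresholding a latent variable with logistic or normal noise), not the authors' SV identification method. The function names, the Hardy-Weinberg genotype sampling, the zero intercept, and the random seed are all assumptions introduced here; only the MAF of 0.005, the sample size of 2,300, and the odds ratio of 7.4 come from the abstract.

```python
import math
import random


def expected_carriers(n, maf):
    """Expected number of subjects with at least one minor allele (HWE assumed)."""
    return n * (1.0 - (1.0 - maf) ** 2)


def simulate_genotypes(n, maf, rng):
    """Additive genotype coding: number of minor alleles (0, 1, or 2) per subject."""
    return [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]


def draw_noise(kind, rng):
    """One latent-noise draw; 'logistic' corresponds to LG, 'normal' to probit."""
    if kind == "logistic":
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the log finite
        return math.log(u / (1.0 - u))  # inverse CDF of the standard logistic
    if kind == "normal":
        return rng.gauss(0.0, 1.0)
    raise ValueError(f"unknown noise kind: {kind}")


def simulate_phenotypes(genotypes, beta0, beta1, kind, rng):
    """Binary phenotype from a thresholded latent variable: y = 1 if beta0 + beta1*g + e > 0."""
    return [1 if beta0 + beta1 * g + draw_noise(kind, rng) > 0 else 0
            for g in genotypes]


if __name__ == "__main__":
    rng = random.Random(2014)      # arbitrary seed, chosen only for repeatability
    n, maf = 2300, 0.005           # values quoted in the abstract
    beta1 = math.log(7.4)          # log-odds ratio for an odds ratio of 7.4
    beta0 = 0.0                    # assumption: intercept is not given in the abstract
    genotypes = simulate_genotypes(n, maf, rng)
    phenotypes = simulate_phenotypes(genotypes, beta0, beta1, "logistic", rng)
    carriers = sum(1 for g in genotypes if g > 0)
    print(f"expected carriers: {expected_carriers(n, maf):.1f}, simulated: {carriers}")
    print(f"cases: {sum(phenotypes)} of {n}")
```

Under these assumptions only about 23 of the 2,300 subjects are expected to carry the minor allele, so any per-variant effect estimate rests on very few informative observations. That is one plausible reading of why estimator variability matters most for rare variants, as the abstract reports.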


Related articles

Molecular Detection of Novel Genetic Variants Associated to Anaplasma ovis among Dromedary Camels in Iran

To the best of our knowledge, little information is available regarding the presence of Anaplasma species in camels in Iran. This study sought to investigate the presence of Anaplasma species by microscopy and polymerase chain reaction (PCR) assays in 100 healthy dromedaries (Camelus dromedarius) arriving for slaughter. The microscopic examination of Giemsa-stained blood films revealed that Ana...


SVSI: fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits.

In genetic association studies of an ordered categorical phenotype, it is usual to either regroup multiple categories of the phenotype into two categories and then apply the logistic regression (LG), or apply ordered logistic (oLG), or ordered probit (oPRB) regression, which accounts for the ordinal nature of the phenotype. However, they may lose statistical power or may not control type I erro...


Genetic and Memetic Algorithms for Sequencing a New JIT Mixed-Model Assembly Line

This paper presents a new mathematical programming model for the bi-criteria mixed-model assembly line balancing problem in a just-in-time (JIT) production system. There is a set of criteria to judge sequences of the product mix in terms of the effective utilization of the system. The primary goal of this model is to minimize the setup cost and the stoppage assembly line cost, simultaneously. B...


I-38: Search for Genetic Causes of Male Infertility

Background: We are convinced that better infertility treatment will only be achieved with a better understanding of the molecular mechanisms specific to each patient. To that effect we want to identify genes involved in male infertility. Materials and Methods: We screened cohorts of infertile men to identify the cause of their infertility. Results: Our team has identified and characterize...


Strategic approaches to unraveling genetic causes of cardiovascular diseases.

DNA sequence variants are major components of the "causal field" for virtually all medical phenotypes, whether single gene familial disorders or complex traits without a clear familial aggregation. The causal variants in single gene disorders are necessary and sufficient to impart large effects. In contrast, complex traits are attributable to a much more complicated network of contributory comp...



Journal:
  • Human Heredity

Volume 78, Issue 2

Publication date: 2014